- 
            The ability to connect the form and meaning of a concept, known as word retrieval, is fundamental to human communication. While various input modalities could lead to identical word retrieval, the exact neural dynamics supporting this process relevant to daily auditory discourse remain poorly understood. Here, we recorded neurosurgical electrocorticography (ECoG) data from 48 patients and dissociated two key language networks critical for word retrieval that highly overlap in time and space. Using unsupervised temporal clustering techniques, we found a semantic processing network located in the middle and inferior frontal gyri. This network was distinct from an articulatory planning network in the inferior frontal and precentral gyri, which was invariant to input modalities. Functionally, we confirmed that the semantic processing network encodes word surprisal during sentence perception. These findings elucidate neurophysiological mechanisms underlying the processing of semantic auditory inputs ranging from passive language comprehension to conversational speech.
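Word surprisal, the quantity the semantic network is reported to encode, is conventionally defined as the negative log probability of a word given its preceding context. A minimal sketch with made-up probabilities (the study's actual language model is not specified here):

```python
import math

def surprisal_bits(p_word_given_context):
    """Surprisal of a word: -log2 P(word | context), in bits."""
    return -math.log2(p_word_given_context)

# Illustrative probabilities only (not from the study).
print(surprisal_bits(0.25))   # predictable word -> 2.0 bits
print(surprisal_bits(0.001))  # surprising word -> ~9.97 bits
```

Higher surprisal marks words that are less predictable from context, which is what makes it a useful probe of semantic processing during sentence perception.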
- 
            Objective: This study investigates speech decoding from neural signals captured by intracranial electrodes. Most prior works can only work with electrodes on a 2D grid (i.e., an electrocorticographic or ECoG array) and data from a single patient. We aim to design a deep-learning model architecture that can accommodate both surface (ECoG) and depth (stereotactic EEG, or sEEG) electrodes. The architecture should allow training on data from multiple participants with large variability in electrode placements; the model should not have subject-specific layers; and the trained model should perform well on participants unseen during training. Approach: We propose a novel transformer-based model architecture named SwinTW that can work with arbitrarily positioned electrodes by leveraging their 3D locations on the cortex rather than their positions on a 2D grid. We train subject-specific models using data from a single participant and multi-subject models exploiting data from multiple participants. Main Results: The subject-specific models using only low-density 8x8 ECoG data achieved a high decoding Pearson correlation coefficient with the ground-truth spectrogram (PCC=0.817) over N=43 participants, significantly outperforming our prior convolutional ResNet model and the 3D Swin transformer model. Incorporating the additional strip, depth, and grid electrodes available in each participant (N=39) led to further improvement (PCC=0.838). For participants with only sEEG electrodes (N=9), subject-specific models still achieved comparable performance, with an average PCC=0.798. A single multi-subject model trained on ECoG data from 15 participants yielded results (PCC=0.837) comparable to 15 models trained individually for these participants (PCC=0.831). Furthermore, the multi-subject models achieved high performance on unseen participants, with an average PCC=0.765 in leave-one-out cross-validation.
Significance: The proposed SwinTW decoder enables future speech decoding approaches to utilize any electrode placement that is clinically optimal or feasible for a particular participant, including using only depth electrodes, which are more routinely implanted in chronic neurosurgical procedures. The success of the single multi-subject model when tested on participants within the training cohort demonstrates that the model architecture can exploit data from multiple participants with diverse electrode placements. The architecture's flexibility in training with both single-subject and multi-subject data, as well as grid and non-grid electrodes, ensures its broad applicability. Importantly, the generalizability of the multi-subject models in our study population suggests that a model trained using paired acoustic and neural data from multiple patients can potentially be applied to new patients with speech disability, for whom collecting acoustic-neural training data is not feasible.
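The decoding metric reported throughout, the Pearson correlation coefficient between decoded and ground-truth spectrograms, can be sketched as follows with hypothetical data (the paper's evaluation pipeline may differ in detail, e.g., per-frame or per-bin averaging):

```python
import numpy as np

def spectrogram_pcc(decoded, target):
    """Pearson correlation between flattened decoded and target spectrograms."""
    return float(np.corrcoef(decoded.ravel(), target.ravel())[0, 1])

rng = np.random.default_rng(0)
target = rng.random((128, 200))                    # freq bins x time frames
decoded = target + 0.1 * rng.standard_normal(target.shape)
print(round(spectrogram_pcc(decoded, target), 2))  # high but imperfect PCC
```

A PCC of 1.0 would mean the decoded spectrogram is a perfect linear match to the ground truth; the reported values around 0.8 indicate strong but imperfect reconstruction.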
- 
            Across the animal kingdom, neural responses in the auditory cortex are suppressed during vocalization, and humans are no exception. A common hypothesis is that suppression increases sensitivity to auditory feedback, enabling the detection of vocalization errors. This hypothesis has been previously confirmed in non-human primates; however, a direct link between auditory suppression and sensitivity in human speech monitoring remains elusive. To address this issue, we obtained intracranial electroencephalography (iEEG) recordings from 35 neurosurgical participants during speech production. We first characterized the detailed topography of auditory suppression, which varied across the superior temporal gyrus (STG). Next, we performed a delayed auditory feedback (DAF) task to determine whether the suppressed sites were also sensitive to auditory feedback alterations. Indeed, overlapping sites showed enhanced responses to feedback, indicating sensitivity. Importantly, there was a strong correlation between the degree of auditory suppression and feedback sensitivity, suggesting suppression might be a key mechanism that underlies speech monitoring. Further, we found that when participants produced speech with simultaneous auditory feedback, posterior STG was selectively activated if participants were engaged in a DAF paradigm, suggesting that increased attentional load can modulate auditory feedback sensitivity.
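One common way to quantify speaking-induced suppression at an electrode (a widely used convention, not necessarily the exact metric of this study) is a normalized contrast between responses to the same sounds heard while listening versus while speaking:

```python
import numpy as np

def suppression_index(listen_power, speak_power):
    """(listen - speak) / (listen + speak): positive values indicate that
    responses are suppressed during speaking relative to listening."""
    listen = np.asarray(listen_power, dtype=float)
    speak = np.asarray(speak_power, dtype=float)
    return (listen - speak) / (listen + speak)

# Hypothetical per-electrode high-gamma power values.
print(suppression_index([2.0, 1.5, 3.0], [1.0, 1.5, 0.5]))
```

The reported suppression-sensitivity relationship could then be assessed by correlating such an index with each site's response enhancement under delayed feedback (e.g., with `np.corrcoef` across electrodes).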
- 
            When we vocalize, our brain distinguishes self-generated sounds from external ones. A corollary discharge signal supports this function in animals; however, in humans, its exact origin and temporal dynamics remain unknown. We report electrocorticographic recordings in neurosurgical patients and a connectivity analysis framework based on Granger causality that reveals major neural communications. We find a reproducible source for corollary discharge across multiple speech production paradigms localized to the ventral speech motor cortex before speech articulation. The uncovered discharge predicts the degree of auditory cortex suppression during speech, its well-documented consequence. These results reveal the human corollary discharge source and timing, with far-reaching implications for speech motor control as well as auditory hallucinations in human psychosis.
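A Granger-style directionality measure asks whether one signal's past improves prediction of another beyond that signal's own past. A minimal numpy-only sketch with simulated data (illustrative only; the paper's connectivity framework is more elaborate):

```python
import numpy as np

def lagged(x, lag):
    """Design matrix of x's lags 1..lag, aligned to predict x[lag:]."""
    n = len(x)
    return np.column_stack([x[lag - j:n - j] for j in range(1, lag + 1)])

def granger_gain(x, y, lag=2):
    """Fractional drop in residual variance when y's past is added to an
    autoregressive model of x. Larger values suggest y -> x influence."""
    target = x[lag:]
    def rss(design):
        design = np.column_stack([np.ones(len(design)), design])
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ beta
        return float(resid @ resid)
    return 1.0 - rss(np.hstack([lagged(x, lag), lagged(y, lag)])) / rss(lagged(x, lag))

rng = np.random.default_rng(0)
y = rng.standard_normal(2000)
x = 0.8 * np.concatenate([[0.0], y[:-1]]) + 0.1 * rng.standard_normal(2000)
print(granger_gain(x, y) > granger_gain(y, x))  # True: y drives x
```

Applied to motor and auditory cortex signals, an asymmetry of this kind is what localizes the putative corollary discharge source before articulation.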
- 
            Decoding human speech from neural signals is essential for brain–computer interface (BCI) technologies that aim to restore speech in populations with neurological deficits. However, it remains a highly challenging task, compounded by the scarcity of neural signals paired with corresponding speech and by the complexity and high dimensionality of the data. Here we present a novel deep learning-based neural speech decoding framework that includes an ECoG decoder, which translates electrocorticographic (ECoG) signals from the cortex into interpretable speech parameters, and a novel differentiable speech synthesizer that maps speech parameters to spectrograms. We have developed a companion speech-to-speech auto-encoder, consisting of a speech encoder and the same speech synthesizer, to generate reference speech parameters that facilitate ECoG decoder training. This framework generates natural-sounding speech and is highly reproducible across a cohort of 48 participants. Our experimental results show that our models can decode speech with high correlation even when limited to causal operations, which is necessary for adoption by real-time neural prostheses. Finally, we successfully decode speech in participants with either left or right hemisphere coverage, which could lead to speech prostheses for patients with deficits resulting from left hemisphere damage.
- 
            This study investigates speech decoding from neural signals captured by intracranial electrodes. Most prior works can only work with electrodes on a 2D grid (i.e., an electrocorticographic or ECoG array) and data from a single patient. We aim to design a deep-learning model architecture that can accommodate both surface (ECoG) and depth (stereotactic EEG, or sEEG) electrodes. The architecture should allow training on data from multiple participants with large variability in electrode placements, and the trained model should perform well on participants unseen during training. Approach: We propose a novel transformer-based model architecture named SwinTW that can work with arbitrarily positioned electrodes by leveraging their 3D locations on the cortex rather than their positions on a 2D grid. We train both subject-specific models using data from a single participant and multi-subject models exploiting data from multiple participants. Main Results: The subject-specific models using only low-density 8x8 ECoG data achieved a high decoding Pearson correlation coefficient with the ground-truth spectrogram (PCC=0.817) over N=43 participants, outperforming our prior convolutional ResNet model and the 3D Swin transformer model. Incorporating the additional strip, depth, and grid electrodes available in each participant (N=39) led to further improvement (PCC=0.838). For participants with only sEEG electrodes (N=9), subject-specific models still achieved comparable performance, with an average PCC=0.798. The multi-subject models achieved high performance on unseen participants, with an average PCC=0.765 in leave-one-out cross-validation. Significance: The proposed SwinTW decoder enables future speech neuroprostheses to utilize any electrode placement that is clinically optimal or feasible for a particular participant, including using only depth electrodes, which are more routinely implanted in chronic neurosurgical procedures.
Importantly, the generalizability of the multi-subject models suggests the exciting possibility of developing speech neuroprostheses for people with speech disability without relying on their own neural data for training, which is not always feasible.
- 
            Speech production is a complex human function requiring continuous feedforward commands together with reafferent feedback processing. These processes are carried out by distinct frontal and temporal cortical networks, but the degree and timing of their recruitment and dynamics remain poorly understood. We present a deep learning architecture that translates neural signals recorded directly from the cortex to an interpretable representational space that can reconstruct speech. We leverage learned decoding networks to disentangle feedforward vs. feedback processing. Unlike prevailing models, we find a mixed cortical architecture in which frontal and temporal networks each process both feedforward and feedback information in tandem. We elucidate the timing of feedforward- and feedback-related processing by quantifying the derived receptive fields. Our approach provides evidence for a surprisingly mixed cortical architecture of speech circuitry together with decoding advances that have important implications for neural prosthetics.
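The receptive-field logic alluded to above can be illustrated with a simple lagged ridge regression relating a speech feature to neural activity: weights at negative lags (activity preceding the sound) would reflect feedforward processing, weights at positive lags feedback. A toy sketch with simulated data, not the paper's deep-learning decoder:

```python
import numpy as np

def lagged_design(x, lags):
    """Columns are copies of x shifted by each lag (zero-padded at edges)."""
    X = np.zeros((len(x), len(lags)))
    for i, L in enumerate(lags):
        if L >= 0:
            X[L:, i] = x[:len(x) - L]
        else:
            X[:len(x) + L, i] = x[-L:]
    return X

def ridge_trf(stimulus, neural, lags, alpha=1.0):
    """Temporal receptive field via ridge regression."""
    X = lagged_design(stimulus, lags)
    return np.linalg.solve(X.T @ X + alpha * np.eye(len(lags)), X.T @ neural)

rng = np.random.default_rng(0)
stim = rng.standard_normal(3000)
lags = list(range(-5, 6))
# Simulated "feedforward" unit: activity tracks the stimulus 3 samples early.
neural = 2.0 * lagged_design(stim, [-3])[:, 0] + 0.1 * rng.standard_normal(3000)
w = ridge_trf(stim, neural, lags)
print(lags[int(np.argmax(np.abs(w)))])  # -3
```

The recovered peak at a negative lag is the signature a feedforward site would show; a feedback site would peak at positive lags instead.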
- 
            When we vocalize, our brain distinguishes self-generated sounds from external ones. A corollary discharge signal supports this function in animals; however, in humans, its exact origin and temporal dynamics remain unknown. We report electrocorticographic (ECoG) recordings in neurosurgical patients and a novel connectivity approach based on Granger causality that reveals major neural communications. We find a reproducible source for corollary discharge across multiple speech production paradigms localized to the ventral speech motor cortex before speech articulation. The uncovered discharge predicts the degree of auditory cortex suppression during speech, its well-documented consequence. These results reveal the human corollary discharge source and timing, with far-reaching implications for speech motor control as well as auditory hallucinations in human psychosis.
- 
            Abstract Objective. The force that an electrocorticography (ECoG) array exerts on the brain manifests when it bends to match the curvature of the skull and cerebral cortex. This force can negatively impact both short-term and long-term patient outcomes. Here we provide a mechanical characterization of a novel liquid crystal polymer (LCP) ECoG array prototype to demonstrate that its thinner geometry reduces the force potentially applied to the cortex of the brain. Approach. We built a low-force flexural testing machine to measure ECoG array bending forces, calculate their effective flexural moduli, and approximate the maximum force they could exert on the human brain. Main results. The LCP ECoG prototype was found to have a maximal force less than 20% of that of any of the commercially available ECoG arrays tested. However, as a material, LCP was measured to be as much as 24× more rigid than silicone, which is traditionally used in ECoG arrays. This suggests that the lower maximal force resulted from the prototype's thinner profile (2.9×–3.25×). Significance. While decreasing material stiffness can lower the force an ECoG array exerts, our LCP ECoG array prototype demonstrated that flexible circuit manufacturing techniques can also lower these forces by decreasing ECoG array thickness. Flexural tests of ECoG arrays are necessary to accurately assess these forces, as material properties for polymers and laminates are often scale dependent. As the polymers used are anisotropic, elastic modulus cannot be used to predict ECoG flexural behavior. Accounting for these factors, we used our four-point flexure testing procedure to quantify the forces exerted on the brain by ECoG array bending. With this experimental method, ECoG arrays can be designed to minimize force exerted on the brain, potentially improving both acute and chronic clinical utility.
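The trade-off described above, a stiffer material overcome by a thinner profile, follows from thin-plate mechanics, where flexural rigidity scales with the cube of thickness: D = E t³ / (12 (1 − ν²)). A first-order sketch with hypothetical round numbers (the measured forces differ further because, as the abstract notes, laminate properties are scale dependent and anisotropic):

```python
def plate_rigidity(E, t, nu=0.3):
    """Flexural rigidity of a thin plate: D = E * t**3 / (12 * (1 - nu**2))."""
    return E * t**3 / (12 * (1 - nu**2))

# Hypothetical: LCP ~24x stiffer as a material but ~3x thinner than silicone.
ratio = plate_rigidity(E=24.0, t=1.0) / plate_rigidity(E=1.0, t=3.0)
print(round(ratio, 2))  # 0.89: the cubic thickness term wins
```

Even a 24× stiffer material yields a less rigid array when thickness drops by 3×, since 24 / 3³ < 1, which is why thinning via flexible-circuit manufacturing lowers the applied force.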
- 
            Abstract Objective: Effective surgical treatment of drug-resistant epilepsy depends on accurate localization of the epileptogenic zone (EZ). High-frequency oscillations (HFOs) are potential biomarkers of the EZ. Previous research has shown that HFOs often occur within submillimeter areas of brain tissue and that the coarse spatial sampling of clinical intracranial electrode arrays may limit the accurate capture of HFO activity. In this study, we sought to characterize microscale HFO activity captured on thin, flexible microelectrocorticographic (μECoG) arrays, which provide high spatial resolution over large cortical surface areas. Methods: We used novel liquid crystal polymer thin-film μECoG arrays (0.76–1.72-mm intercontact spacing) to capture HFOs in eight intraoperative recordings from seven patients with epilepsy. We identified ripple (80–250 Hz) and fast ripple (250–600 Hz) HFOs using a common energy thresholding detection algorithm along with two stages of artifact rejection. We visualized microscale subregions of HFO activity using spatial maps of HFO rate, signal-to-noise ratio, and mean peak frequency. We quantified the spatial extent of HFO events by measuring covariance between detected HFOs and surrounding activity. We also compared HFO detection rates on microcontacts to simulated macrocontacts by spatially averaging data. Results: We found visually delineable subregions of elevated HFO activity within each μECoG recording. Forty-seven percent of HFOs occurred on single 200-μm-diameter recording contacts, with minimal high-frequency activity on surrounding contacts. Other HFO events occurred across multiple contacts simultaneously, with covarying activity most often limited to a 0.95-mm radius. Through spatial averaging, we estimated that macrocontacts with 2–3-mm diameter would only capture 44% of the HFOs detected in our μECoG recordings.
Significance: These results demonstrate that thin-film microcontact surface arrays with both high resolution and large coverage accurately capture microscale HFO activity and may improve the utility of HFOs to localize the EZ for treatment of drug-resistant epilepsy.
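The "common energy thresholding detection algorithm" mentioned above typically flags segments whose short-window RMS energy exceeds a statistical threshold. A simplified numpy sketch (assumes the signal is already band-pass filtered to the ripple band; real detectors add baseline estimation, duration criteria, and the two artifact-rejection stages described):

```python
import numpy as np

def detect_hfo_samples(bandpassed, fs, win_ms=10.0, k=3.0):
    """Boolean mask of samples whose windowed RMS exceeds mean + k*SD."""
    win = max(1, int(fs * win_ms / 1000))
    rms = np.sqrt(np.convolve(bandpassed**2, np.ones(win) / win, mode="same"))
    return rms > rms.mean() + k * rms.std()

fs = 2000
rng = np.random.default_rng(1)
sig = 0.1 * rng.standard_normal(fs)                  # 1 s of background noise
t = np.arange(100) / fs
sig[1000:1100] += 2.0 * np.sin(2 * np.pi * 150 * t)  # brief 150 Hz ripple burst
mask = detect_hfo_samples(sig, fs)
print(mask[1000:1100].any(), mask[:900].any())       # burst detected, noise not
```

Running such a detector per contact, then on spatially averaged groups of contacts, is the kind of comparison that yields the reported drop from microcontact to simulated macrocontact detection rates.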
 An official website of the United States government